1 6,855,114 Automated method and system for the detection of abnormalities in sonographic images

2 6,853,952 Method and systems of enhancing the effectiveness and success of research and development

3 6,851,604 Method and apparatus for providing price updates

4 6,850,252 Intelligent electronic appliance system and method

5 6,836,777 System and method for constructing generic analytical database applications

6 6,829,384 Object finder for photographic images

7 6,819,796 Method of and apparatus for segmenting a pixellated image

8 6,789,069 Method for enhancing knowledge discovered from biological data using a learning machine

9 6,760,715 Enhancing biological knowledge discovery using multiples support vector machines

10 6,728,690 Classification system trainer employing maximum margin back-propagation with probabilistic outputs

11 6,714,967 Integration of a computer-based message priority system with mobile electronic devices

12 6,714,925 System for identifying patterns in biological data using a distributed network

13 6,662,192 System and method for data collection, evaluation, information generation, and presentation

14 6,658,396 Neural network drug dosage estimation

15 6,658,395 Enhancing knowledge discovery from multiple data sets using multiple support vector machines

16 6,643,187 Compressed event counting technique and application to a flash memory system

17 6,633,857 Relevance vector machine

18 6,625,315 Method and apparatus for identifying objects depicted in a videostream

19 6,622,160 Methods for routing items for communications based on a measure of criticality

20 6,618,716 Computational architecture for managing the transmittal and rendering of information, alerts, and notifications

21 6,601,055 Explanation generation system for a diagnosis support tool employing an inference system

22 6,601,012 Contextual models and methods for inferring attention and location
23 6,594,584 Method for calculating a distance between a well logging instrument and a formation boundary by inversion processing measurements from the logging instrument

24 6,591,146 Method for learning switching linear dynamic system models from data

25 6,591,004 Sure-fit: an automated method for modeling the shape of cerebral cortex and other complex structures using customized filters and transformations

26 6,553,356 Multi-view computer-assisted diagnosis

27 6,553,352 Interface for merchandise price optimization

28 6,513,026 Decision theoretic principles and policies for notification

29 6,453,056 Method and apparatus for generating a database of road sign images and positions

30 6,449,384 Method and apparatus for rapidly determining whether a digitized image frame contains an object of interest

31 6,427,141 Enhancing knowledge discovery using multiple support vector machines

32 6,377,640 Means and method for a synchronous network communications system

33 6,363,161 System for automatically generating database of objects of interest by analysis of images recorded by moving vehicle

34 6,345,001 Compressed event counting technique and application to a flash memory system

35 6,327,581 Methods and apparatus for building a support vector machine classifier

36 6,266,442 Method and apparatus for identifying objects depicted in a videostream

37 6,219,626 Automated diagnostic system

38 6,161,130 Technique which utilizes a probabilistic classifier to detect "junk" e-mail by automatically updating a training and re-training the classifier based on the updated training set

39 6,157,921 Enhancing knowledge discovery using support vector machines in a distributed network environment

40 6,128,608 Enhancing knowledge discovery using multiple support vector machines

41 6,056,690 Method of diagnosing breast cancer

42 6,031,935 Method and apparatus for segmenting images using constant-time deformable contours

43 6,004,267 Method for diagnosing and staging prostate cancer

44 5,964,700 Medical network management article of manufacture

45 5,764,923 Medical network management system and process

46 5,764,515 Method for predicting, by means of an inversion technique, the evolution of the production of an underground reservoir

47 5,487,133 Distance calculating neural network classifier chip and system

48 5,301,317 System for adapting query optimization effort to expected execution time

49 5,296,861 Method and apparatus for maximum likelihood estimation direct integer search in differential carrier phase attitude determination systems

50 5,271,088 Automated sorting of voice messages through speaker spotting
1 Attention and integration: Learning and reasoning about interruption
Eric Horvitz, Johnson Apacible

November 2003 Proceedings of the 5th international conference on Multimodal
interfaces

Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references, index terms

We present methods for inferring the cost of interrupting users based on multiple streams 
of events including information generated by interactions with computing devices, visual 
and acoustical analyses, and data drawn from online calendars. Following a review of prior 
work on techniques for deliberating about the cost of interruption associated with 
notifications, we introduce methods for learning models from data that can be used to 
compute the expected cost of interruption for a user. We desc ... 

Keywords: cognitive models, divided attention, interruption, notifications 



2 Exact Bayesian Structure Discovery in Bayesian Networks
Mikko Koivisto, Kismat Sood

August 2004 The Journal of Machine Learning Research, volume 5

Full text available: pdf(261.89 KB) Additional Information: full citation, abstract, index terms

Learning a Bayesian network structure from data is a well-motivated but computationally 
hard task. We present an algorithm that computes the exact posterior probability of a 
subnetwork, e.g., a directed edge; a modified version of the algorithm finds one of the most 
probable network structures. This algorithm runs in time $O(n 2^n + n^{k+1} C(m))$, where n is the
number of network variables, k is a constant maximum in- ...



3 Text summarization via hidden Markov models 
John M. Conroy, Dianne P. O'Leary

September 2001 Proceedings of the 24th annual international ACM SIGIR conference on
Research and development in information retrieval

Full text available: pdf(133.10 KB) Additional Information: full citation, abstract, references, citings, index terms



A sentence extract summary of a document is a subset of the document's sentences that
contains the main ideas in the document. We present an approach to generating such
summaries, a hidden Markov model that judges the likelihood that each sentence should be
contained in the summary. We compare the results of this method with summaries
generated by humans, showing that we obtain significantly higher agreement than do
earlier methods.

Keywords: automatic summarization, document summarization, extract summaries, 
hidden Markov models, text summarization 



4 Envy-free auctions for digital goods
Andrew V. Goldberg, Jason D. Hartline

June 2003 Proceedings of the 4th ACM conference on Electronic commerce

Full text available: pdf(169.88 KB) Additional Information: full citation, abstract, references, index terms

We study auctions for a commodity in unlimited supply, e.g., a digital good. In particular we
consider three desirable properties for auctions: Competitive: the auction achieves a
constant fraction of the optimal revenue even on worst case inputs. Truthful: any
bidder's best strategy is to bid the maximum value they are willing to pay. Envy-free:
after the auction is run, no bidder would be happier with the outcome of another bidder (for
digital good auctions, this means that ther ...

Keywords: auctions, competitive analysis 



5 Predictive automatic relevance determination by expectation propagation
Yuan (Alan) Qi, Thomas P. Minka, Rosalind W. Picard, Zoubin Ghahramani
July 2004 Twenty-first international conference on Machine learning

Full text available: pdf(314.64 KB) Additional Information: full citation, abstract, references

In many real-world classification problems the input contains a large number of potentially 
irrelevant features. This paper proposes a new Bayesian framework for determining the 
relevance of input features. This approach extends one of the most successful Bayesian 
methods for feature selection and sparse learning, known as Automatic Relevance 
Determination (ARD). ARD finds the relevance of features by optimizing the model marginal 
likelihood, also known as the evidence. We show that this can lea ... 

6 Research track papers: Interestingness of frequent itemsets using Bayesian networks as background knowledge
Szymon Jaroszewicz, Dan A. Simovici

August 2004 Proceedings of the 2004 ACM SIGKDD international conference on
Knowledge discovery and data mining

Full text available: pdf(191.90 KB) Additional Information: full citation, abstract, references, index terms

The paper presents a method for pruning frequent itemsets based on background 
knowledge represented by a Bayesian network. The interestingness of an itemset is defined 
as the absolute difference between its support estimated from data and from the Bayesian 
network. Efficient algorithms are presented for finding interestingness of a collection of 
frequent itemsets, and for finding all attribute sets with a given minimum interestingness. 
Practical usefulness of the algorithms and their efficiency ... 

Keywords: Bayesian network, association rule, background, frequent itemset, 
interestingness, knowledge 



7 Measurements and testbeds: A framework for interpreting measurement over Internet
Kave Salamatian, Serge Fdida

August 2003 Proceedings of the ACM SIGCOMM workshop on Models, methods and
tools for reproducible network research

Full text available: pdf(352.04 KB) Additional Information: full citation, abstract, references

This paper introduces a methodology for interpreting measurement obtained over Internet.
The paper is motivated by the fact that a large number of published papers in empirical 
networking analysis follow a generic framework that might be formalized and generalized to 
a large class of problem. The objective of this paper is to present an interpretation 
framework and to illustrate it by examples coming from the networking literature. The aim 
of the paper is rather to give to the researcher who is ... 

Keywords: Internet, Interpretation, Measurement, modelling 



8 Learning with mixtures of trees
Marina Meila, Michael I. Jordan

September 2001 The Journal of Machine Learning Research, volume 1
Full text available: pdf(400.02 KB) Additional Information: full citation, abstract

This paper describes the mixtures-of-trees model, a probabilistic model for discrete 
multidimensional domains. Mixtures-of-trees generalize the probabilistic trees of Chow and 
Liu (1968) in a different and complementary direction to that of Bayesian networks. We 
present efficient algorithms for learning mixtures-of-trees models in maximum likelihood 
and Bayesian frameworks. We also discuss additional efficiencies that can be obtained when 
data are "sparse," and we present data structures and alg ... 

9 Approximately-strategyproof and tractable multi-unit auctions
Anshul Kothari, David C. Parkes, Subhash Suri

June 2003 Proceedings of the 4th ACM conference on Electronic commerce

Full text available: pdf(302.85 KB) Additional Information: full citation, abstract, references, citings, index terms

We present an approximately-efficient and approximately-strategyproof auction mechanism
for a single-good multi-unit allocation problem. The bidding language in our auctions allows
marginal-decreasing piecewise constant curves. First, we develop a fully polynomial-time
approximation scheme for the multi-unit allocation problem, which computes a $(1+\epsilon)$-approximation in
worst-case time $T = O(n^3/\epsilon)$, given n bids each with a constant number of pieces. Second,
we embed this approximation ... 

Keywords: approximation algorithm, multi-unit auctions, strategyproof 

10 Competitive solutions for online financial problems
Ran El-Yaniv

March 1998 ACM Computing Surveys (CSUR), volume 30 issue 1

Full text available: pdf(331.62 KB) Additional Information: full citation, abstract, references, citings, index terms

This article surveys results concerning online algorithms for solving problems related to the
management of money and other assets. In particular, the survey focuses on search,
replacement, and portfolio selection problems.

11 Parallel logic simulation of VLSI systems

Mary L Bailey, Jack V. Briner, Roger D. Chamberlain

September 1994 ACM Computing Surveys (CSUR), volume 26 issue 3

Full text available: pdf(3.74 MB) Additional Information: full citation, abstract, references, citings, index terms

Fast, efficient logic simulators are an essential tool in modern VLSI system design. Logic 
simulation is used extensively for design verification prior to fabrication, and as VLSI 
systems grow in size, the execution time required by simulation is becoming more and
more significant. Faster logic simulators will have an appreciable economic impact, speeding 
time to market while ensuring more thorough system design testing. One approach to this 
problem is to utilize parallel processing, taking ... 

Keywords: circuit structure, parallel architecture, parallelism, partitioning, synchronization 
algorithm, timing granularity 



12 Bayes point machines

Ralf Herbrich, Thore Graepel, Colin Campbell

September 2001 The Journal of Machine Learning Research, volume 1
Full text available: pdf(1.02 MB) Additional Information: full citation, abstract

Kernel-classifiers comprise a powerful class of non-linear decision functions for binary 
classification. The support vector machine is an example of a learning algorithm for kernel 
classifiers that singles out the consistent classifier with the largest margin, i.e. minimal 
real-valued output on the training sample, within the set of consistent hypotheses, the so- 
called version space. We suggest the Bayes point machine as a well-founded improvement 
which approximates the Bayes-optim ... 

13 Sparse bayesian learning and the relevance vector machine
Michael E. Tipping

September 2001 The Journal of Machine Learning Research, volume 1
Full text available: pdf(999.88 KB) Additional Information: full citation, abstract

This paper introduces a general Bayesian framework for obtaining sparse solutions to 
regression and classification tasks utilising models linear in the parameters. Although this
framework is fully general, we illustrate our approach with a particular specialisation that
we denote the 'relevance vector machine' (RVM), a model of identical functional form to the
popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by
exploiting a probabilistic Bayesian learning framewor ... 

14 Context-specific Bayesian clustering for gene expression data
Yoseph Barash, Nir Friedman

April 2001 Proceedings of the fifth annual international conference on Computational
biology

Full text available: pdf(233.32 KB) Additional Information: full citation, abstract, references, citings, index terms

The recent growth in genomic data and measurement of genome-wide expression patterns
allows to examine gene regulation by transcription factors using computational tools. In this
work, we present a class of mathematical models that help in understanding the
connections between transcription factors and functional classes of genes based on genetic
and genomic data. These models represent the joint distribution of transcription factor
binding sites and of expression levels of a gene in a single ...

15 Cost-benefit methodology for office systems
Peter G. Sassone

July 1987 ACM Transactions on Information Systems (TOIS), volume 5 issue 3

Full text available: pdf(1.27 MB) Additional Information: full citation, abstract, references, citings, index terms

The time savings times salary (TSTS) approach is a widely used methodology for the
financial justification of office information systems, yet its theoretical basis is largely
unexplored. In this paper, we identify its underlying economic model, including five critical
assumptions. We find that the model, though somewhat restrictive, is not unreasonable.
However, we find that the time-saving-times-salary calculation, per se, is implicitly based
on a very particular assumptio ...

16 Software metrics: roadmap
Norman E. Fenton, Martin Neil

May 2000 Proceedings of the Conference on The Future of Software Engineering

Full text available: pdf(1.25 MB) Additional Information: full citation, references, citings, index terms



Keywords: Bayesian belief nets, causal models, multi-criteria decision aid, risk
assessment, software metrics 



17 Case-factor diagrams for structured probabilistic modeling
David McAllester, Michael Collins, Fernando Pereira

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Full text available: pdf(405.79 KB) Additional Information: full citation, abstract, references

We introduce a probabilistic formalism subsuming Markov random fields of bounded tree
width and probabilistic context free grammars. Our models are based on a representation of 
Boolean formulas that we call case-factor diagrams (CFDs). CFDs are similar to binary 
decision diagrams (BDDs) but are concise for circuits of bounded tree width (unlike BDDs) 
and can concisely represent the set of parse trees over a given string under a given context 
free grammar (also unlike BDDs). A probabilistic mo ... 

18 Coalition formation: Coalition formation with uncertain heterogeneous information
Sarit Kraus, Onn Shehory, Gilad Taase

July 2003 Proceedings of the second international joint conference on Autonomous
agents and multiagent systems

Full text available: pdf(245.62 KB) Additional Information: full citation, abstract, references, index terms

Coalition formation methods allow agents to join together and are thus necessary in cases 
where tasks can only be performed cooperatively by groups. This is the case in the Request 
For Proposal (RFP) domain, where some requester business agent issues an RFP - a 
complex task comprised of sub-tasks - and several service provider agents need to join 
together to address this RFP. In such environments the value of the RFP may be common 
knowledge, however the costs that an agent incurs for performing ... 

Keywords: RFP, coalition formation, experimentation, incomplete information, task 
allocation 



19 Text categorization: Using asymmetric distributions to improve text classifier probability estimates
Paul N. Bennett

July 2003 Proceedings of the 26th annual international ACM SIGIR conference on
Research and development in information retrieval

Full text available: pdf(281.97 KB) Additional Information: full citation, abstract, references, index terms

Text classifiers that give probability estimates are more readily applicable in a variety of 
scenarios. For example, rather than choosing one set decision threshold, they can be used 
in a Bayesian risk model to issue a run-time decision which minimizes a user-specified cost 
function dynamically chosen at prediction time. However, the quality of the probability 
estimates is crucial. We review a variety of standard approaches to converting scores (and 
poor probability estimates) from text classifi ... 
Keywords: active learning, classifier combination, cost-sensitive learning, text
classification 



20 When ignorance is bliss
Peter D. Grunwald, Joseph Y. Halpern

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Full text available: pdf(333.33 KB) Additional Information: full citation, abstract, references

It is commonly-accepted wisdom that more information is better, and that information 
should never be ignored. Here we argue, using both a Bayesian and a non-Bayesian 
analysis, that in some situations you are better off ignoring information if your uncertainty 
is represented by a set of probability measures. These include situations in which the 
information is relevant for the prediction task at hand. In the non-Bayesian
analysis, we show how ignoring information avoids d ...
21 A new characterization of probabilities in Bayesian networks
Lenhart K. Schubert

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Full text available: pdf(465.79 KB) Additional Information: full citation, abstract, references

We characterize probabilities in Bayesian networks in terms of algebraic expressions called 
quasi-probabilities. These are arrived at by casting Bayesian networks as noisy AND-OR- 
NOT networks, and viewing the subnetworks that lead to a node as arguments for or 
against a node. Quasi-probabilities are in a sense the "natural" algebra of Bayesian 
networks: we can easily compute the marginal quasi-probability of any node recursively, in 
a compact form; and we can obtain the joint quasi-probabilit ... 



22 Cost benefit analysis of information systems: a survey of methodologies 
Peter G. Sassone 

April 1988 ACM SIGOIS Bulletin, Conference Sponsored by ACM SIGOIS and IEEECS
TC-OA on Office information systems, volume 9 issue 2-3

Full text available: pdf(999.47 KB) Additional Information: full citation, abstract, references, citings, index terms



Cost justification has become one of the most important factors influencing the pace of 
business automation, particularly end user computing. The primary difficulty in cost
justification is the evaluation of benefits. This paper identifies and discusses eight
methodologies which have evolved to quantify the benefits of information systems. These
are: decision analysis, cost displacement/avoidance, structural models, cost of effectiveness 
analysis, breakeven analysis, subjective anal ... 

23 Industry/government track papers: Learning to detect malicious executables in the wild
Jeremy Z. Kolter, Marcus A. Maloof

August 2004 Proceedings of the 2004 ACM SIGKDD international conference on
Knowledge discovery and data mining

Full text available: pdf(216.52 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we describe the development of a fielded application for detecting malicious 
executables in the wild. We gathered 1971 benign and 1651 malicious executables and 
encoded each as a training example using n-grams of byte codes as features. Such 
processing resulted in more than 255 million distinct n-grams. After selecting the most 
relevant n-grams for prediction, we evaluated a variety of inductive methods, including 
naive Bayes, decision trees, support vector machines, and boosting. ... 
Keywords: concept learning, data mining, malicious software, security

24 On inclusion-driven learning of bayesian networks
Robert Castelo, Tomáš Kočka

December 2003 The Journal of Machine Learning Research, volume 4

Full text available: pdf(980.59 KB) Additional Information: full citation, abstract, references, index terms

Two or more Bayesian network structures are Markov equivalent when the corresponding 
acyclic digraphs encode the same set of conditional independencies. Therefore, the search 
space of Bayesian network structures may be organized in equivalence classes, where each 
of them represents a different set of conditional independencies. The collection of sets of 
conditional independencies obeys a partial order, the so-called "inclusion order." This paper 
discusses in depth the role that the inclusion ord ... 

25 KDD-99 conference reports: Profiling your customers using Bayesian networks
Paola Sebastiani, Marco Ramoni, Alexander Crea

January 2000 ACM SIGKDD Explorations Newsletter, volume 1 issue 2

Full text available: pdf(1.22 MB) Additional Information: full citation, abstract

This report describes a complete Knowledge Discovery session using Bayesware Discoverer, 
a program for the induction of Bayesian networks from incomplete data. We build two 
causal models to help an American Charitable Organization understand the characteristics 
of respondents to direct mail fund raising campaigns. The first model is a Bayesian network
induced from the database of 96,376 Lapsed donors to the June '97 renewal mailing. The 
network describes the dependency of the probability of resp ... 

Keywords: Bayesian networks, customer profiling, missing data 



26 An evaluation of statistical spam filtering techniques
Le Zhang, Jingbo Zhu, Tianshun Yao

December 2004 ACM Transactions on Asian Language Information Processing (TALIP),
volume 3 issue 4

Full text available: pdf(343.64 KB) Additional Information: full citation, abstract, references, index terms

This paper evaluates five supervised learning methods in the context of statistical spam 
filtering. We study the impact of different feature pruning methods and feature set sizes on 
each learner's performance using cost-sensitive measures. It is observed that the
significance of feature selection varies greatly from classifier to classifier. In particular, we 
found support vector machine, AdaBoost, and maximum entropy model are top performers 
in this evaluation, sharing similar characteristics: ... 

Keywords: Spam filtering, text categorization 

27 An empirical evaluation of possible variations of lazy propagation
Anders L. Madsen

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Full text available: pdf(383.90 KB) Additional Information: full citation, abstract, references

As real-world Bayesian networks continue to grow larger and more complex, it is important 
to investigate the possibilities for improving the performance of existing algorithms of 
probabilistic inference. Motivated by examples, we investigate the dependency of the 
performance of Lazy propagation on the message computation algorithm. 
We show how Symbolic Probabilistic Inference (SPI) and Arc-Reversal (AR) can be used for
computation of clique to clique messages in the addition to the tra ... 

28 Propositional and relational Bayesian networks associated with imprecise and 
qualitative probabilistic assessments 

Fabio Gagliardi Cozman, Cassio Polpo de Campos, Jaime Shinsuke Ide, Jose Carlos Ferreira da 
Rocha 

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence 

Full text available: pdf(340.75 KB) Additional Information: full citation, abstract, references

This paper investigates a representation language with flexibility inspired by probabilistic 
logic and compactness inspired by relational Bayesian networks. The goal is to handle 
propositional and first-order constructs together with precise, imprecise, indeterminate and 
qualitative probabilistic assessments. The paper shows how this can be achieved through 
the theory of credal networks. New exact and approximate inference algorithms based on 
multilinear programming and iterated/loopy propaga ... 

29 Fast Binary Feature Selection with Conditional Mutual Information 
François Fleuret

December 2004 The Journal of Machine Learning Research, volume 5
Full text available: pdf(211.52 KB) Additional Information: full citation, abstract

We propose in this paper a very fast feature selection technique based on conditional 
mutual information. By picking features which maximize their mutual information with the 
class to predict conditional to any feature already picked, it ensures the selection of 
features which are both individually informative and two-by-two weakly dependant. We 
show that this feature selection method outperforms other classical algorithms, and that a 
naive Bayesian classifier built with features selected that w ... 

30 Learning Bayesian network classifiers by maximizing conditional likelihood
Daniel Grossman, Pedro Domingos

July 2004 Twenty-first international conference on Machine learning

Full text available: pdf(187.23 KB) Additional Information: full citation, abstract, references

Bayesian networks are a powerful probabilistic representation, and their use for 
classification has received considerable attention. However, they tend to perform poorly 
when learned in the standard way. This is attributable to a mismatch between the objective 
function used (likelihood or a function thereof) and the goal of classification (maximizing 
accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing 
structure and parameters for conditional likelihood is pro ... 

31 A proposal for valuing information and instrumental goods
Marshall V. Van Alstyne

January 1999 Proceeding of the 20th international conference on Information Systems

Full text available: pdf(405.51 KB) Additional Information: full citation, references, index terms



32 Comparison of Bayesian and frequentist assessments of uncertainty for selecting the best system

Koichiro Inoue, Stephen E. Chick

December 1998 Proceedings of the 30th conference on Winter simulation

Full text available: pdf(88.32 KB) Additional Information: full citation, references, citings, index terms
33 Common knowledge
John Geanakoplos

March 1992 Proceedings of the 4th conference on Theoretical aspects of reasoning
about knowledge

Full text available: pdf(3.29 MB) Additional Information: full citation, abstract, references

People, no matter how rational they are, usually act on the basis of incomplete information.
If they are rational they recognize their own ignorance and reflect carefully on what they
know and what they do not know, before choosing how to act. Furthermore, when rational
agents interact, they also think about what the others know, and what the others know
about what they know, before choosing how to act. Failing to do so can be disastrous.
When the notorious evil genius Professor Moriarty conf ... 

34 Research track papers: A Bayesian network framework for reject inference
Andrew Smith, Charles Elkan

August 2004 Proceedings of the 2004 ACM SIGKDD international conference on
Knowledge discovery and data mining

Full text available: pdf(201.00 KB) Additional Information: full citation, abstract, references, index terms

Most learning methods assume that the training set is drawn randomly from the population 
to which the learned model is to be applied. However in many applications this assumption 
is invalid. For example, lending institutions create models of who is likely to repay a loan 
from training sets consisting of people in their records to whom loans were given in the 
past; however, the institution approved loan applications previously based on who was 
thought unlikely to default. Learning from only appro ... 

Keywords: Bayesian networks, Heckman estimator, expectation-maximization, propensity 
scores, reject inference, sample selection bias 



35 Queries and aggregation: Cleaning and querying noisy sensors
Eiman Elnahrawy, Badri Nath

September 2003 Proceedings of the 2nd ACM international conference on Wireless
sensor networks and applications

Full text available: pdf(256.08 KB) Additional Information: full citation, abstract, references, citings, index terms

Sensor networks have become an important source of data with numerous applications in 
monitoring various real-life phenomena as well as industrial applications and traffic control. 
Unfortunately, sensor data is subject to several sources of errors such as noise from 
external sources, hardware noise, inaccuracies and imprecision, and various environmental 
effects. Such errors may seriously impact the answer to any query posed to the sensors. In 
particular, they may yield imprecise or even incorre ... 

Keywords: bayesian theory, noisy sensors, query evaluation, statistics, uncertainty, 
wireless sensor networks 

36 An algorithm for the recovery of both target joint beliefs and full belief from Bayesian
networks
Mark Bloemeke 

April 1998 Proceedings of the 36th annual Southeast regional conference 

Full text available: pdf(635.94 KB) Additional Information: full citation, references, index terms

37 PAC-Bayesian generalisation error bounds for Gaussian process classification
Matthias Seeger 

March 2003 The Journal of Machine Learning Research, volume 3 

Full text available: pdf(487.11 KB) Additional Information: full citation, abstract, references, index terms

Approximate Bayesian Gaussian process (GP) classification techniques are powerful non- 
parametric learning methods, similar in appearance and performance to support vector 
machines. Based on simple probabilistic models, they render interpretable results and can
be embedded in Bayesian frameworks for model selection, feature selection, etc. In this 
paper, by applying the PAC-Bayesian theorem of McAllester (1999a), we prove distribution- 
free generalisation error bounds for a wide range of approxima ... 

Keywords: Bayesian learning, Gaussian processes, Gibbs classifier, Kernel machines, PAC- 
Bayesian framework, convex duality, generalisation error bounds, sparse approximations 



38 Tractable learning of large Bayes net structures from sparse data
Anna Goldenberg, Andrew Moore

July 2004 Twenty-first international conference on Machine learning 

Full text available: pdf(127.86 KB) Additional Information: full citation, abstract, references

This paper addresses three questions. Is it useful to attempt to learn a Bayesian network 
structure with hundreds of thousands of nodes? How should such structure search proceed 
practically? The third question arises out of our approach to the second: how can Frequent 
Sets (Agrawal et al., 1993), which are extremely popular in the area of descriptive data 
mining, be turned into a probabilistic model? Large sparse datasets with hundreds of
thousands of records and attributes appear in social netwo ... 

Keywords: Bayes Net structure learning, Bayesian networks/graphical models, statistical 
learning 



39 Poster papers: Transforming classifier scores into accurate multiclass probability
estimates

Bianca Zadrozny, Charles Elkan 

July 2002 Proceedings of the eighth ACM SIGKDD international conference on 
Knowledge discovery and data mining 

Full text available: pdf(690.25 KB) Additional Information: full citation, abstract, references, index terms

Class membership probability estimates are important for many applications of data mining 
in which classification outputs are combined with other sources of information for decision- 
making, such as example-dependent misclassification costs, the outputs of other classifiers,
or domain knowledge. Previous calibration methods apply only to two-class problems. Here, 
we show how to obtain accurate probability estimates for multiclass problems by combining 
calibrated binary probability estimates. We a ... 

40 Model Averaging for Prediction with Discrete Bayesian Networks
Denver Dash, Gregory F. Cooper 

December 2004 The Journal of Machine Learning Research, volume 5

Full text available: pdf(267.17 KB) Additional Information: full citation, abstract

In this paper we consider the problem of performing Bayesian model-averaging over a class
of discrete Bayesian network structures consistent with a partial ordering and with bounded
in-degree k. We show that for N nodes this class contains in the worst-case at least
[formula image: "omega eq"] distinct network structures, and yet model averaging over
these structures can be performed using ...

41 Boosting as a Regularized Path to a Maximum Margin Classifier
Saharon Rosset, Ji Zhu, Trevor Hastie 

August 2004 The Journal of Machine Learning Research, volume 5

Full text available: pdf(553.71 KB) Additional Information: full citation, abstract

In this paper we study boosting methods from a new perspective. We build on recent work
by Efron et al. to show that boosting approximately (and in some cases exactly) minimizes
its loss criterion with an l1 constraint on the coefficient vector. This helps understand the
success of boosting with early stopping as regularized fitting of the loss criterion. For the
two most commonly used criteria (exponential and binomial log-likelihood), we further show
that as the constraint is ...



42 Book reviews 
Karen Sutherland 

June 2001 intelligence, volume 12 issue 2 

Full text available: pdf(358.84 KB), html(41.71 KB) Additional Information: full citation, references, index terms



43 Special issue on the fusion of domain knowledge with data for decision support: Fusion
of domain knowledge with data for structural learning in object oriented domains
Helge Langseth, Thomas D. Nielsen

December 2003 The Journal of Machine Learning Research, volume 4

Full text available: pdf(227.18 KB) Additional Information: full citation, abstract, references, index terms, review



When constructing a Bayesian network, it can be advantageous to employ structural 
learning algorithms to combine knowledge captured in databases with prior information 
provided by domain experts. Unfortunately, conventional learning algorithms do not easily 
incorporate prior information, if this information is too vague to be encoded as properties 
that are local to families of variables. For instance, conventional algorithms do not exploit 
prior information about repetitive structures, which are ... 

44 On linear potential functions for approximating Bayesian computations
Eugene Santos 

May 1996 Journal of the ACM (JACM), volume 43 issue 3

Full text available: pdf(1.95 MB) Additional Information: full citation, abstract, references, index terms, review

Probabilistic reasoning suffers from NP-hard implementations. In particular, the amount of
probabilistic information necessary to the computations is often overwhelming. For 
example, the size of conditional probability tables in Bayesian networks has long been a 
limiting factor in the general use of these networks. We present a new approach for 
manipulating the probabilistic information given. This approach avoids being overwhelmed 
by essentially compressing the information using ... 

Keywords: artificial intelligence, data compaction and compression, integer programming, 
least squares approximation, pattern recognition, probabilistic reasoning, uncertainty 



45 Learning and evaluating classifiers under sample selection bias
Bianca Zadrozny 

July 2004 Twenty-first international conference on Machine learning 

Full text available: pdf(243.73 KB) Additional Information: full citation, abstract, references, citings

Classifier learning methods commonly assume that the training data consist of randomly 
drawn examples from the same distribution as the test examples about which the learned 
model is expected to make predictions. In many practical situations, however, this 
assumption is violated, in a problem known in econometrics as sample selection bias. In 
this paper, we formalize the sample selection bias problem in machine learning terms and 
study analytically and experimentally how a number of well-known c ...

46 Mini-buckets: A general scheme for bounded inference
Rina Dechter, Irina Rish

March 2003 Journal of the ACM (JACM), volume 50 issue 2 

Full text available: pdf(902.27 KB) Additional Information: full citation, abstract, references, citings, index terms

This article presents a class of approximation algorithms that extend the idea of bounded- 
complexity inference, inspired by successful constraint propagation algorithms, to 
probabilistic inference and combinatorial optimization. The idea is to bound the 
dimensionality of dependencies created by inference algorithms. This yields a 
parameterized scheme, called mini-buckets, that offers adjustable trade-off between 
accuracy and efficiency. The mini-bucket approach to optimization problems, s ... 

Keywords: Accuracy/complexity trade-off, Bayesian networks, approximation algorithms, 
combinatorial optimization, probabilistic inference. 



47 Oral presentation session IV: estimation and detection: Distributed state representation
for tracking problems in sensor networks
Juan Liu, Maurice Chu, Jie Liu, Jim Reich, Feng Zhao 

April 2004 Proceedings of the third international symposium on Information 
processing in sensor networks

Full text available: pdf(266.46 KB) Additional Information: full citation, abstract, references, index terms

This paper investigates the problem of designing decentralized representations to support 
monitoring and inferences in sensor networks. State-space models of physical phenomena 
such as those arising from tracking multiple interacting targets, while commonly used in 
signal processing and control, suffer from the curse of dimensionality as the number of 
phenomena of interest increases. Furthermore, mapping an inference algorithm onto a 
distributed sensor network must appropriately allocate scarce ... 






Keywords: ad hoc network, group collaboration, Information, multi-target tracking, sensor 
network, target localization 



48 Technical poster session 1: multimedia analysis, processing, and retrieval: A semi-naïve
Bayesian method incorporating clustering with pair-wise constraints for auto image
annotation

Wanjun Jin, Rui Shi, Tat-Seng Chua 

October 2004 Proceedings of the 12th annual ACM international conference on 
Multimedia 

Full text available: pdf(258.93 KB) Additional Information: full citation, abstract, references, index terms

We propose a novel approach for auto image annotation. In our approach, we first perform
the segmentation of images into regions, followed by clustering of regions, before learning
the relationship between concepts and region clusters using the set of training images with
pre-assigned concepts. The main focus of this paper is two-fold. First, in the learning stage,
we perform clustering of regions into region clusters by incorporating pair-wise constraints 
which are derived by considering the ... 

Keywords: image annotation, pair-wise constraint, semi-naive Bayes, semi-supervised 
clustering 



49 Greedy algorithms for classification -- consistency, convergence rates, and adaptivity
Shie Mannor, Ron Meir, Tong Zhang 

December 2003 The Journal of Machine Learning Research, volume 4 

Full text available: pdf(269.33 KB) Additional Information: full citation, abstract, references, index terms

Many regression and classification algorithms proposed over the years can be described as
greedy procedures for the stagewise minimization of an appropriate cost function. Some 
examples include additive models, matching pursuit, and boosting. In this work we focus on 
the classification problem, for which many recent algorithms have been proposed and 
applied successfully. For a specific regularized form of greedy stagewise optimization, we 
prove consistency of the approach under rather general co ... 

50 Advanced tutorials: Bayesian methods: Bayesian methods for simulation
Stephen E. Chick 

December 2000 Proceedings of the 32nd conference on Winter simulation 

Full text available: pdf(113.00 KB) Additional Information: full citation, abstract, references, citings

This tutorial describes some ways that Bayesian methods address problems that arise
during simulation studies. This includes quantifying uncertainty about input distributions
and parameters, sensitivity analysis, and the selection of the best of several simulated
alternatives. Focus is on illustrating the main ideas and their relevance to practical
problems. Numerous citations for both introductory and more advanced material provide a 
launching pad into the Bayesian literature. 

51 Mechanisms for coalition formation and cost sharing in an electronic marketplace
Cuihong Li, Uday Rajan, Shuchi Chawla, Katia Sycara

September 2003 Proceedings of the 5th international conference on Electronic 
commerce 

Full text available: pdf(237.54 KB) Additional Information: full citation, abstract, references

In this paper we study the mechanism design problem of coalition formation and cost
sharing in an electronic marketplace, where buyers can form coalitions to take advantage of
discounts based on volume. The desirable mechanism properties include stability (being in
the core), and incentive compatibility with good efficiency, concepts from the perspectives of
cooperative and non-cooperative game theory. We first analyze the problem from both 
these perspectives. We show the impossibility to simulta ... 

52 Sequential allocations that reduce risk for multiple comparisons
Stephen E. Chick, Koichiro Inoue 

December 1998 Proceedings of the 30th conference on Winter simulation 

Full text available: pdf(116.47 KB) Additional Information: full citation, references, citings, index terms



53 Technical correspondence 
CORPORATE Tech Correspondence 

November 1985 Communications of the ACM, volume 28 issue 11

Full text available: pdf(1.03 MB) Additional Information: full citation, references, citings, index terms



54 An empirical evaluation of several methods to select the best system 
Koichiro Inoue, Stephen E. Chick, Chun-Hung Chen 

October 1999 ACM Transactions on Modeling and Computer Simulation (TOMACS),
volume 9 issue 4

Full text available: pdf(199.74 KB) Additional Information: full citation, abstract, references, citings, index terms

Simulation is an important tool for comparing the performance of several alternative
systems. There is therefore significant interest in procedures that efficiently select the best
system, where best is defined by the maximum or minimum expected simulation output. In
this paper, we examine both two-stage and sequential procedures that represent three 
structurally different modeling methodologies for allocating simulation replications to 
identify the best system, and we evaluate them empiric ... 

Keywords: discrete-event simulation, multiple selection procedures, ranking and selection 



55 The Advantages of Compromising in Coalition Formation with Incomplete Information
Sarit Kraus, Onn Shehory, Gilad Taase

July 2004 Proceedings of the Third International Joint Conference on Autonomous 
Agents and Multiagent Systems - Volume 2 

Full text available: pdf(338.65 KB) Additional Information: full citation, abstract, index terms

This paper presents protocols and strategies for coalition formation with incomplete 
information under time constraints. It focuses on strategies for coalition members to 
distribute revenues amongst themselves. Such strategies should preferably be stable, lead 
to a fair distribution, and maximize the social welfare of the agents. These properties are 
only partially supported by existing coalition formation mechanisms. In particular, stability 
and the maximization of social welfare are supported ... 



56 Selectivity estimation using probabilistic models 
Lise Getoor, Benjamin Taskar, Daphne Koller

May 2001 ACM SIGMOD Record, Proceedings of the 2001 ACM SIGMOD international
conference on Management of data, volume 30 issue 2

Full text available: pdf(525.74 KB) Additional Information: full citation, abstract, references, citings, index terms

Estimating the result size of complex queries that involve selection on multiple attributes
and the join of several relations is a difficult but fundamental task in database query
processing. It arises in cost-based query optimization, query profiling, and approximate
query answering. In this paper, we show how probabilistic graphical models can be
effectively used for this task as an accurate and compact approximation of the joint
frequency distribution of multiple attributes across multiple ...

57 Robust probabilistic inference in distributed systems
Mark A. Paskin, Carlos E. Guestrin

July 2004 Proceedings of the 20th conference on Uncertainty in artificial intelligence 

Full text available: pdf(524.33 KB) Additional Information: full citation, abstract, references

Probabilistic inference problems arise naturally in distributed systems such as sensor 
networks and teams of mobile robots. Inference algorithms that use message passing are a 
natural fit for distributed systems, but they must be robust to the failure situations that 
arise in real-world settings, such as unreliable communication and node failures. 
Unfortunately, the popular sum-product algorithm can yield very poor estimates in these
settings because the nodes' beliefs before convergence can ... 

58 Session 2: An economic answer to unsolicited communication
Thede Loder, Marshall Van Alstyne, Rick Wash 

May 2004 Proceedings of the 5th ACM conference on Electronic commerce 

Full text available: pdf(352.80 KB) Additional Information: full citation, abstract, references, index terms

We explore an alternative approach to spam based on economic rather than technological
or regulatory screening mechanisms. We employ a model of email value which supports two
intuitive notions: 1) mechanisms designed to promote valuable communication can often
outperform those designed merely to block wasteful communication, and 2) designers of
such mechanisms should shift focus away from the information in the message to the
information known to the sender. We then use principles of informatio ... 

Keywords: filtering, information asymmetry, mechanism design, screening, signaling, 
spam, uce 

59 Special issue on learning from imbalanced datasets: Minority report in fraud detection:
classification of skewed data

Clifton Phua, Damminda Alahakoon, Vincent Lee 

June 2004 ACM SIGKDD Explorations Newsletter, volume 6 issue 1

Full text available: pdf(262.38 KB) Additional Information: full citation, abstract, references, citings

This paper proposes an innovative fraud detection method, built upon existing fraud 
detection research and Minority Report, to deal with the data mining problem of skewed 
data distributions. This method uses backpropagation (BP), together with naive Bayesian 
(NB) and C4.5 algorithms, on data partitions derived from minority oversampling with 
replacement. Its originality lies in the use of a single meta-classifier (stacking) to choose 
the best base classifiers, and then combine these base ... 

Keywords: fraud detection, meta-learning, multiple classifier systems 



60 Information access and retrieval: Evaluating cost-sensitive Unsolicited Bulk Email
categorization
Jose Maria Gomez Hidalgo 

March 2002 Proceedings of the 2002 ACM symposium on Applied computing 

Full text available: pdf(566.16 KB) Additional Information: full citation, abstract, references, citings, index terms

In the recent years, Unsolicited Bulk Email has become an increasingly important problem,
with a big economic impact. In this paper, we discuss cost-sensitive Text Categorization
methods for UBE filtering. In concrete, we have evaluated a range of Machine Learning
methods for the task (C4.5, Naive Bayes, PART, Support Vector Machines and Rocchio),
made cost sensitive through several methods (Threshold Optimization, Instance Weighting,
and Meta-Cost). We have used the Receiver Operating Character ...

Keywords: cost-sensitive classification, evaluation, text categorization, unsolicited bulk
email